Language independent search in MediaEval's Spoken Web Search task : Spoken Content Retrieval
Internal identifier: 000322 (France/Analysis); previous: 000321; next: 000323
Authors: Florian Metze [United States]; Xavier Anguera [Spain]; Etienne Barnard [South Africa]; Marelie Davel [South Africa]; Guillaume Gravier [France]
Source:
- Computer speech & language : (Print) [ 0885-2308 ] ; 2014.
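French descriptors
- Pascal (Inist): Traitement automatique de la parole, Evaluation, Linguistique informatique, Recherche d'information, Reconnaissance de la parole
English descriptors
- KwdEn: Assessment, Computational linguistics, Information retrieval, Speech processing, Speech recognition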
Abstract
In this paper, we describe several approaches to language-independent spoken term detection and compare their performance on a common task, namely "Spoken Web Search". The goal of this part of the MediaEval initiative is to perform low-resource language-independent audio search using audio as input. The data was taken from "spoken web" material collected over mobile phone connections by IBM India as well as from the LWAZI corpus of African languages. As part of the 2011 and 2012 MediaEval benchmark campaigns, a number of diverse systems were implemented by independent teams, and submitted to the "Spoken Web Search" task. This paper presents the 2011 and 2012 results, and compares the relative merits and weaknesses of approaches developed by participants, providing analysis and directions for future research, in order to improve voice access to spoken information in low resource settings.
Affiliations:
- South Africa, Spain, France, United States
- Catalonia, Pennsylvania, Brittany
- Barcelona, Pittsburgh, Rennes
- Carnegie Mellon University
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000314
- to stream PascalFrancis, to step Curation: 004588
- to stream PascalFrancis, to step Checkpoint: 000669
- to stream Main, to step Merge: 005592
- to stream Main, to step Curation: 005310
- to stream Main, to step Exploration: 005310
- to stream France, to step Extraction: 000322
Links to Exploration step
Francis:15-0046601
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Language independent search in MediaEval's Spoken Web Search task : Spoken Content Retrieval</title>
<author><name sortKey="Metze, Florian" sort="Metze, Florian" uniqKey="Metze F" first="Florian" last="Metze">Florian Metze</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Carnegie Mellon University</s1>
<s2>Pittsburgh, PA</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><settlement type="city">Pittsburgh</settlement>
<region type="state">Pennsylvanie</region>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Anguera, Xavier" sort="Anguera, Xavier" uniqKey="Anguera X" first="Xavier" last="Anguera">Xavier Anguera</name>
<affiliation wicri:level="3"><inist:fA14 i1="03"><s1>Telefonica Research</s1>
<s2>Barcelona</s2>
<s3>ESP</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Espagne</country>
<placeName><settlement type="city">Barcelone</settlement>
<region nuts="2" type="region">Catalogne</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Barnard, Etienne" sort="Barnard, Etienne" uniqKey="Barnard E" first="Etienne" last="Barnard">Etienne Barnard</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>North-West University</s1>
<s2>Vanderbijlpark</s2>
<s3>ZAF</s3>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Afrique du Sud</country>
<wicri:noRegion>North-West University</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Davel, Marelie" sort="Davel, Marelie" uniqKey="Davel M" first="Marelie" last="Davel">Marelie Davel</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>North-West University</s1>
<s2>Vanderbijlpark</s2>
<s3>ZAF</s3>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Afrique du Sud</country>
<wicri:noRegion>North-West University</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Gravier, Guillaume" sort="Gravier, Guillaume" uniqKey="Gravier G" first="Guillaume" last="Gravier">Guillaume Gravier</name>
<affiliation wicri:level="3"><inist:fA14 i1="04"><s1>CNRS-IRISA</s1>
<s2>Rennes</s2>
<s3>FRA</s3>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region">Région Bretagne</region>
<region type="old region">Région Bretagne</region>
<settlement type="city">Rennes</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">15-0046601</idno>
<date when="2014">2014</date>
<idno type="stanalyst">FRANCIS 15-0046601 INIST</idno>
<idno type="RBID">Francis:15-0046601</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000314</idno>
<idno type="wicri:Area/PascalFrancis/Curation">004588</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000669</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000669</idno>
<idno type="wicri:doubleKey">0885-2308:2014:Metze F:language:independent:search</idno>
<idno type="wicri:Area/Main/Merge">005592</idno>
<idno type="wicri:Area/Main/Curation">005310</idno>
<idno type="wicri:Area/Main/Exploration">005310</idno>
<idno type="wicri:Area/France/Extraction">000322</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Language independent search in MediaEval's Spoken Web Search task : Spoken Content Retrieval</title>
<author><name sortKey="Metze, Florian" sort="Metze, Florian" uniqKey="Metze F" first="Florian" last="Metze">Florian Metze</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Carnegie Mellon University</s1>
<s2>Pittsburgh, PA</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><settlement type="city">Pittsburgh</settlement>
<region type="state">Pennsylvanie</region>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author><name sortKey="Anguera, Xavier" sort="Anguera, Xavier" uniqKey="Anguera X" first="Xavier" last="Anguera">Xavier Anguera</name>
<affiliation wicri:level="3"><inist:fA14 i1="03"><s1>Telefonica Research</s1>
<s2>Barcelona</s2>
<s3>ESP</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Espagne</country>
<placeName><settlement type="city">Barcelone</settlement>
<region nuts="2" type="region">Catalogne</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Barnard, Etienne" sort="Barnard, Etienne" uniqKey="Barnard E" first="Etienne" last="Barnard">Etienne Barnard</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>North-West University</s1>
<s2>Vanderbijlpark</s2>
<s3>ZAF</s3>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Afrique du Sud</country>
<wicri:noRegion>North-West University</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Davel, Marelie" sort="Davel, Marelie" uniqKey="Davel M" first="Marelie" last="Davel">Marelie Davel</name>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>North-West University</s1>
<s2>Vanderbijlpark</s2>
<s3>ZAF</s3>
<sZ>3 aut.</sZ>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>Afrique du Sud</country>
<wicri:noRegion>North-West University</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Gravier, Guillaume" sort="Gravier, Guillaume" uniqKey="Gravier G" first="Guillaume" last="Gravier">Guillaume Gravier</name>
<affiliation wicri:level="3"><inist:fA14 i1="04"><s1>CNRS-IRISA</s1>
<s2>Rennes</s2>
<s3>FRA</s3>
<sZ>5 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region">Région Bretagne</region>
<region type="old region">Région Bretagne</region>
<settlement type="city">Rennes</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Computer speech &amp; language : (Print)</title>
<title level="j" type="abbreviated">Comput. speech lang. : (Print)</title>
<idno type="ISSN">0885-2308</idno>
<imprint><date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Computer speech &amp; language : (Print)</title>
<title level="j" type="abbreviated">Comput. speech lang. : (Print)</title>
<idno type="ISSN">0885-2308</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Assessment</term>
<term>Computational linguistics</term>
<term>Information retrieval</term>
<term>Speech processing</term>
<term>Speech recognition</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Traitement automatique de la parole</term>
<term>Evaluation</term>
<term>Linguistique informatique</term>
<term>Recherche d'information</term>
<term>Reconnaissance de la parole</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">In this paper, we describe several approaches to language-independent spoken term detection and compare their performance on a common task, namely "Spoken Web Search". The goal of this part of the MediaEval initiative is to perform low-resource language-independent audio search using audio as input. The data was taken from "spoken web" material collected over mobile phone connections by IBM India as well as from the LWAZI corpus of African languages. As part of the 2011 and 2012 MediaEval benchmark campaigns, a number of diverse systems were implemented by independent teams, and submitted to the "Spoken Web Search" task. This paper presents the 2011 and 2012 results, and compares the relative merits and weaknesses of approaches developed by participants, providing analysis and directions for future research, in order to improve voice access to spoken information in low resource settings.</div>
</front>
</TEI>
<affiliations><list><country><li>Afrique du Sud</li>
<li>Espagne</li>
<li>France</li>
<li>États-Unis</li>
</country>
<region><li>Catalogne</li>
<li>Pennsylvanie</li>
<li>Région Bretagne</li>
</region>
<settlement><li>Barcelone</li>
<li>Pittsburgh</li>
<li>Rennes</li>
</settlement>
<orgName><li>Université Carnegie-Mellon</li>
</orgName>
</list>
<tree><country name="États-Unis"><region name="Pennsylvanie"><name sortKey="Metze, Florian" sort="Metze, Florian" uniqKey="Metze F" first="Florian" last="Metze">Florian Metze</name>
</region>
</country>
<country name="Espagne"><region name="Catalogne"><name sortKey="Anguera, Xavier" sort="Anguera, Xavier" uniqKey="Anguera X" first="Xavier" last="Anguera">Xavier Anguera</name>
</region>
</country>
<country name="Afrique du Sud"><noRegion><name sortKey="Barnard, Etienne" sort="Barnard, Etienne" uniqKey="Barnard E" first="Etienne" last="Barnard">Etienne Barnard</name>
</noRegion>
<name sortKey="Davel, Marelie" sort="Davel, Marelie" uniqKey="Davel M" first="Marelie" last="Davel">Marelie Davel</name>
</country>
<country name="France"><region name="Région Bretagne"><name sortKey="Gravier, Guillaume" sort="Gravier, Guillaume" uniqKey="Gravier G" first="Guillaume" last="Gravier">Guillaume Gravier</name>
</region>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Amérique/explor/PittsburghV1/Data/France/Analysis
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000322 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/France/Analysis/biblio.hfd -nk 000322 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Amérique |area= PittsburghV1 |flux= France |étape= Analysis |type= RBID |clé= Francis:15-0046601 |texte= Language independent search in MediaEval's Spoken Web Search task : Spoken Content Retrieval }}
This area was generated with Dilib version V0.6.38.